The Failure Detection and Identification (FDI) process is viewed as consisting of
two  stages  :  residual  generation  and  decision  making.  It  is  argued  that  a  robust  FDI 
system can be achieved by designing a robust residual generation process. Analytical
redundancy,  the  basis  for  residual  generation,  is  characterized  in  terms  of  a  parity 
space.  Using  the  concept  of  parity  relations,  residuals  can  be  generated  in  a  number 
of  ways  and  the  design  of  a  robust  residual  generation  process  can  be  formulated  as  a 
minimax  optimization  problem.  An  example  is  included  to  illustrate  this  design 
methodology. 
I. INTRODUCTION

Physical  systems  are  often  subjected  to  unexpected  changes,  such  as  component 
failures  and  variations  in  operating  condition,  that  tend  to  degrade  overall  system 
performance. We will refer to such changes as "failures", although they may not
represent  the  failing  of  physical  components.  In  order  to  maintain  a  high  level  of 
performance,  it  is  important  that  failures  be  promptly  detected  and  identified  so  that 
appropriate  remedies  can  be  applied.  Over  the  past  decade  numerous  approaches  to 
the  problem  of  Failure  Detection  and  Identification  (FDI)  in  dynamical  systems  have 
been developed [1]; Detection Filters [2,3], the Generalized Likelihood Ratio (GLR)
method [4,5], and the Multiple Model method [5,6] are some examples. All of these
analytical  methods  require  that  a  dynamic  model  of  some  sort  be  given.  The  goal  of 
this  paper  is  to  investigate  the  issue  of  designing  FDI  systems  which  are  robust  to 
uncertainties  in  the  models  on  which  they  are  based. 

The  FDI  process  essentially  consists  of  two  stages  :  residual  generation  and 
decision making. For a particular set of hypothesized failures, an FDI system has the
structure  shown  in  Figure  1.  Outputs  from  sensors  are  initially  processed  to  enhance 
the  effect  of  a  failure  (if  present)  so  that  it  can  be  recognized.  The  processed 
measurements are called the residuals, and the enhanced failure effect on the residuals
is called the signature of the failure. Intuitively, the residuals represent the difference
between various functions of the observed sensor outputs and the expected values of
these functions in the normal (no-fail) mode. In the absence of a failure, residuals
should be unbiased, showing agreement between observed and expected normal
behavior of the system; a failure signature typically takes the form of residual biases
that are characteristic of the failure. Thus, residual generation is based on knowledge
of  the  normal  behavior  of  the  system.  The  actual  process  of  residual  generation  can 
vary in complexity. For example, in voting systems [7,8] the residuals are simply the
differences  of  the  outputs  of  the  various  like  sensors,  whereas  in  a  GLR  system,  the 
residuals  are  the  innovations  generated  by  the  Kalman  filter. 

In  the  second  stage  of  an  FDI  algorithm,  the  decision  process,  the  residuals  are 
examined  for  the  presence  of  failure  signatures.  Decision  functions  or  statistics  are 
calculated  using  the  residuals,  and  a  decision  rule  is  then  applied  to  the  decision 
statistics  to  determine  if  any  failure  has  occurred.  A  decision  process  may  consist  of  a 
simple  threshold  test  on  the  instantaneous  values  or  moving  averages  of  the  residuals, 
or  it  may  be  based  directly  on  methods  of  statistical  decision  theory,  e.g.  the 
Sequential  Probability  Ratio  Test  [9]. 

The  first  concern  in  the  design  of  an  FDI  system  is  detection  performance,  i.e.  the 
ability  to  detect  and  identify  failures  promptly  and  correctly  with  minimal  delays  and 
false  alarms.  In  the  literature,  this  issue  has  typically  been  dealt  with  using  a  given 
model  of  the  normal  system  behavior.  An  equally  important  design  issue  that  is 
necessarily  examined  in  practice  but  has  received  little  theoretical  attention  is 
robustness: minimizing the sensitivity of detection performance to model errors and
uncertainties.  An  ideal  simplistic  approach  to  designing  a  robust  FDI  system  is  to 
include  all  uncertainties  in  the  overall  problem  specification;  then  a  robust  design  is 
obtained  by  optimizing  (in  some  sense)  the  performance  of  the  entire  system  with  the 
uncertainties  present.  However,  this  generally  leads  to  a  complex  mathematical 
problem  that  is  too  difficult  to  solve  in  practice.  On  the  other  hand,  a  simple  approach 
is to ignore all model uncertainties in the performance optimization process. The
resulting  design  is  then  evaluated  in  the  presence  of  modelling  errors.  If  the 
degradation  in  performance  is  tolerable,  the  design  is  accepted.  Otherwise,  it  is 
modified  and  re-evaluated.  Although  this  method  often  yields  acceptable  designs,  it 
has  several  drawbacks.  For  example,  it  may  be  unclear  what  parts  of  the  design 
should  be  modified  and  what  form  the  modification  should  take.  Furthermore,  each 
iteration  may  be  very  expensive  to  carry  out  since  extensive  Monte  Carlo  simulations 
are  often  required  for  performance  evaluations. 

In  this  paper  we  develop  a  systematic  approach  that  considers  uncertainties  directly. 
Our work is motivated by the practical design effort of Deckert et al. for an aircraft
sensor  FDI  system  [10].  The  basic  idea  used  in  this  work  was  to  identify  the 
analytical  redundancy  relations  of  the  system  that  were  known  well  and  those  that 
contained  substantial  uncertainties.  An  FDI  system  (i.e.  its  residual  generation 
process)  was  then  designed  based  primarily  on  the  well-known  relationships  (and  only 
secondarily on the less well-known relations) of the system behavior. As model errors
directly affect residual generation, this approach suggests that robustness can be
effectively achieved by designing a robust residual generation process. In our work,
we have extracted and extended the practical idea underlying this application and
developed a general approach to the design of robust FDI algorithms. In addition to its
use in specifying residual generation procedures, our approach is also useful as it can
provide  a  quantitative  measure  of  the  attainable  level  of  robustness  in  the  early  stages 
of  a  design.  This  can  allow  the  designer  to  assess  what  he  can  expect  in  terms  of 
overall  performance. 
In order to develop residual generation procedures, it is important to identify the
redundancy  relations  of  a  system  and  to  characterize  them  according  to  how  they  are 
affected  by  model  errors  and  uncertainties.  In  this  paper,  we  further  develop  the 
concept  of  analytical  redundancy  that  is  used  in  [10,11],  and  we  use  this  as  a  basis  for 
determining  redundancy  relations  to  be  used  for  residual  generation  which  are  least 
sensitive  to  model  errors. 

In Section II we describe the concept of analytical redundancy and present a
mathematical characterization of redundancy in linear dynamical systems that extends
ideas developed previously. We also provide for the first time a clear, general
interpretation of a redundancy relation as a reduced-order Auto-Regressive-Moving-Average
(ARMA) model and use this in Section III to describe the various ways that
analytical redundancy can be used for residual generation and FDI. In Section IV a
method of determining redundancy relations that are least sensitive to model error
and noise effects is described. A numerical example illustrating some of the
developed concepts is presented in Section V. Conclusions and discussions are
included in Section VI.


II. ANALYTICAL REDUNDANCY - PARITY RELATIONS


The  basis  for  residual  generation  is  analytical  redundancy,  which  essentially  takes 
two  forms  :  1)  direct  redundancy  -  the  relationship  among  instantaneous  outputs  of 
sensors,  and  2)  temporal  redundancy  -  the  relationship  among  the  histories  of  sensor 
outputs  and  actuator  inputs.  It  is  based  on  these  relationships  that  outputs  of 
(dissimilar)  sensors  (at  different  times)  can  be  compared.  The  residuals  resulting 
from  these  comparisons  are  then  measures  of  the  discrepancy  between  the  behavior  of 
observed  sensor  outputs  and  the  behavior  that  should  result  under  normal  conditions. 
Examples  where  direct  redundancy  was  exploited  include  [7,8,11,12,13];  explicit  use 
of  temporal  redundancy  was  made  in  [10]. 

In  order  to  develop  a  clear  picture  of  redundancy,  consider  the  following 
deterministic  model: 

$$x(k+1) = A\,x(k) + \sum_{j=1}^{q} b_j u_j(k) \tag{1a}$$

$$y_j(k) = c_j x(k), \quad j = 1, \ldots, M \tag{1b}$$

where $x$ is the $N$-dimensional state vector, $A$ is a constant $N \times N$ matrix, $b_j$ is a
constant column $N$-vector, and $c_j$ is a constant row $N$-vector. The scalar $u_j$ is the
known input to the $j$-th actuator, and the scalar $y_j$ is the output of the $j$-th sensor.


Direct redundancy exists among sensors whose outputs are algebraically related,
i.e. the sensor outputs are related in such a way that the variable one sensor measures
can be determined by the instantaneous outputs of the other sensors. For the system
(1), this corresponds to the situation where a number of the $c_j$'s are linearly
dependent. In this case, the value of one of the observations can be written as a
linear combination of the other outputs. For example, we might have

$$y_1(k) = \sum_{i=2}^{M} \alpha_i y_i(k) \tag{2}$$

where the $\alpha_i$'s are constants. This indicates that under normal conditions the ideal
output of sensor 1 can be calculated from those of the remaining sensors. In the
absence of a failure in the sensors, the residual $y_1(k) - \sum_{i=2}^{M} \alpha_i y_i(k)$ should be zero. A
deviation from this behavior provides the indication that one of the sensors has failed.
This  is  the  underlying  principle  used  in  Strapdown  Inertial  Reference  Unit  (SIRU) 
FDI  [7,8].  Note  that  while  direct  redundancy  is  useful  for  sensor  failure  detection  it  is 
not useful for detecting actuator failures (as modelled by a change in the $b_j$, for
instance). 
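A minimal sketch of the direct-redundancy check in (2) follows. The three measurement rows, the coefficients and the bias size are hypothetical values chosen only so that the first row is the sum of the other two; noise is ignored.

```python
# Hypothetical measurement rows: c1 = c2 + c3, so alpha_2 = alpha_3 = 1 in (2).
C = [(1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
ALPHA = (1.0, 1.0)

def sensor_outputs(x, bias=(0.0, 0.0, 0.0)):
    """Noise-free outputs y_j = c_j x, plus an optional additive bias per sensor."""
    return [c[0] * x[0] + c[1] * x[1] + b for c, b in zip(C, bias)]

def direct_residual(y):
    """Residual y_1 - sum_i alpha_i y_i of the direct redundancy relation (2)."""
    return y[0] - sum(a * yi for a, yi in zip(ALPHA, y[1:]))

x = (2.0, -0.5)                                   # arbitrary state
r_ok = direct_residual(sensor_outputs(x))         # no failure
r_fail = direct_residual(sensor_outputs(x, bias=(0.0, 0.3, 0.0)))  # sensor 2 biased
```

In this sketch the residual vanishes in the no-fail case and equals minus the bias when sensor 2 is biased. A single relation of this kind signals that some sensor has failed but does not by itself say which one.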

Because temporal redundancy relates sensor outputs and actuator inputs, it can
potentially  be  used  for  both  sensor  and  actuator  FDI.  For  example,  consider  the 
relationship  between  velocity  (v)  and  acceleration  (a)  : 


$$v(k+1) = v(k) + T a(k) \tag{3}$$

where $T$ is the sampling interval. Equation (3) prescribes a way of comparing velocity
measurements and accelerometer outputs (by checking to see if the residual
$v(k+1) - v(k) - T a(k)$ is zero) that may be used in a mixed velocity-acceleration sensor
voting  system  for  the  detection  of  both  types  of  sensor  failures.  Temporal  redundancy 
facilitates  the  comparison  of  sensors  among  which  direct  redundancy  does  not  exist. 
Hence  it  can  lead  to  a  reduction  of  hardware  redundancy  for  sensor  FDI.  Viewed  in  a 
different light, the use of analytical redundancy implies that additional sensor failures
can  in  principle  be  detected  with  the  same  level  of  hardware  redundancy. 


To  see  how  temporal  redundancy  can  be  exploited  for  detecting  actuator  failures, 
let  us  consider  a  simplified  first-order  model  of  a  vehicle  in  motion  : 


$$v(k+1) = a\,v(k) + T u(k) \tag{4}$$

where $v$ denotes the vehicle's velocity, $a$ is a scalar constant between zero and one
reflecting the effect of friction and drag, $T$ is the sampling interval, and $u$ is the
commanded engine force (actuator input) divided by the vehicle's mass. Now the
velocity measurements can be compared to the actuator inputs by means of (4), i.e.
through examining the residual $v(k+1) - a\,v(k) - T u(k)$. An actuator failure can be
inferred if the sensor is functioning normally but the residual is nonzero.
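The actuator check based on (4) can be sketched as follows. The numerical values of `a` and `T`, the constant command, and the failure model (the actuator delivering only a fraction `effectiveness` of the commanded input after step `fail_at`) are all assumptions made for illustration; the velocity sensor is taken to be noise-free and healthy.

```python
def simulate(a, T, u_cmd, v0=0.0, fail_at=None, effectiveness=1.0):
    """Simulate v(k+1) = a v(k) + T u_applied(k); from step fail_at on, the
    actuator delivers only `effectiveness` times the commanded input."""
    v = [v0]
    for k, u in enumerate(u_cmd):
        applied = u * effectiveness if (fail_at is not None and k >= fail_at) else u
        v.append(a * v[-1] + T * applied)
    return v

def residuals(a, T, v, u_cmd):
    """Residual v(k+1) - a v(k) - T u(k) from (4), using the commanded input."""
    return [v[k + 1] - a * v[k] - T * u_cmd[k] for k in range(len(u_cmd))]

a, T = 0.9, 0.1                      # illustrative parameter values
u_cmd = [1.0] * 10                   # constant commanded input
r_nofail = residuals(a, T, simulate(a, T, u_cmd), u_cmd)
r_fail = residuals(a, T, simulate(a, T, u_cmd, fail_at=5, effectiveness=0.5), u_cmd)
```

With these values the residual stays at zero until the failure and then jumps to T times the lost input, here -0.05. The same residual would also respond to an error in the assumed value of `a`, which is the robustness issue taken up in the text.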


While  the  additional  information  supplied  by  dissimilar  sensors  and  actuators  at 
different  times  through  temporal  redundancy  facilitates  the  detection  of  a  greater 
variety  of  failures  and  reduces  hardware  redundancy,  exploitation  of  this  additional 
information  often  results  in  increased  computational  complexity,  since  the  dynamics  of 
the  system  are  used  in  the  residual  generation  process.  However,  the  major  issue  in 
the  use  of  analytical  redundancy  is  the  inevitable  uncertainty  in  our  knowledge  of  the 
system dynamics (e.g. of a in (4)) and the consequences of this uncertainty on
the  robustness  of  the  resulting  FDI  algorithm.  From  the  above  discussion  one 
approach  to  the  design  of  robust  residual  generation  in  any  given  application  is 
evident:  first,  the  various  redundancies  that  are  relevant  to  the  failures  under 
consideration  are  to  be  determined;  then,  residual  generation  is  based  on  those 
relations that are least sensitive to parameter uncertainties. This is the approach we
have adopted. In the remainder of this section we will present a characterization of
analytical  redundancy  and  in  a  subsequent  section  we  will  quantify  the  effect  of 
uncertainties  on  a  redundancy  relation. 


The  Generalized  Parity  Space 


Let  us  define 


$$C_j(k) = \begin{bmatrix} c_j \\ c_j A \\ \vdots \\ c_j A^{k} \end{bmatrix}, \quad k = 0, 1, \ldots, \quad j = 1, \ldots, M \tag{5}$$

The well-known Cayley-Hamilton theorem [14] implies that there is an $n_j$, $1 \le n_j \le N$,
such that

$$\operatorname{rank} C_j(k) = \begin{cases} k+1, & k < n_j \\ n_j, & k \ge n_j \end{cases} \tag{6}$$

The null space of the matrix $C_j(n_j - 1)$ is known as the unobservable subspace of the
$j$-th sensor. The rows of $C_j(n_j - 1)$ span a subspace of $\mathbb{R}^N$ that is the orthogonal
complement of the unobservable subspace. Such a subspace will be referred to as the
observable subspace of the $j$-th sensor, and it has dimension $n_j$.
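The rank pattern in (6) can be checked numerically. The sketch below uses a hypothetical second-order system with an upper-triangular $A$ and two sensors, $c_1 = [1\ 0]$ and $c_2 = [0\ 1]$; the numerical entries are illustrative only, and exact rational arithmetic is used so that ranks are not affected by rounding.

```python
from fractions import Fraction

def row_times_matrix(row, A):
    """Row vector times square matrix."""
    return [sum(row[i] * A[i][j] for i in range(len(row))) for j in range(len(A[0]))]

def obs_matrix(c, A, k):
    """C_j(k) from (5): rows c, cA, ..., cA^k."""
    rows = [list(c)]
    for _ in range(k):
        rows.append(row_times_matrix(rows[-1], A))
    return rows

def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in M]
    r = 0
    for col in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][col] / M[r][col]
            M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

A = [[Fraction(1, 2), 1], [0, Fraction(4, 5)]]      # illustrative values
ranks_1 = [rank(obs_matrix([1, 0], A, k)) for k in range(4)]   # sensor 1
ranks_2 = [rank(obs_matrix([0, 1], A, k)) for k in range(4)]   # sensor 2
```

For this system the ranks grow as k + 1 until they saturate, giving $n_1 = 2$ and $n_2 = 1$: sensor 1 sees both states through the coupling in $A$, while sensor 2 sees only the second state.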

Let $\omega$ be a row vector of dimension $n = \sum_{i=1}^{M} (n_i + 1)$ such that $\omega = [\omega^1, \ldots, \omega^M]$,
where $\omega^j$, $j = 1, \ldots, M$, is an $(n_j + 1)$-dimensional row vector. Consider a nonzero $\omega$
satisfying

$$[\omega^1, \ldots, \omega^M] \begin{bmatrix} C_1(n_1) \\ \vdots \\ C_M(n_M) \end{bmatrix} x(k) = 0 \quad \text{for all } x(k) \in \mathbb{R}^N \tag{7}$$

Note that in the above equation $C_j(n_j)$ has $n_j + 1$ rows while it is only of rank $n_j$. The
reason for this will become clear when we discuss the temporal redundancy for a
single sensor. Assuming that the system (1) is observable, there are only $n - N$ linearly
independent $\omega$'s satisfying (7). We let $\Omega$ be an $(n - N) \times n$ matrix with a set of such
independent $\omega$'s as its rows. (The matrix $\Omega$ is not unique.) Assuming all the inputs
are zero for the moment, we have:


$$P(k) = \Omega \begin{bmatrix} Y_1(k, n_1) \\ \vdots \\ Y_M(k, n_M) \end{bmatrix} \tag{8}$$

where

$$Y_j(k, n_j) = \begin{bmatrix} y_j(k) \\ \vdots \\ y_j(k + n_j) \end{bmatrix}, \quad j = 1, \ldots, M$$

Note that Equation (8) is independent of the state $x(k)$. The $(n - N)$-vector $P(k)$ is
called the parity vector. In the absence of noise and failures, $P(k) = 0$. In the noisy
no-fail case, $P(k)$ is a zero-mean random vector. Under noise and failures, $P(k)$ will
become biased. Moreover, different failures will produce different (biases in the)
$P(k)$'s. Thus, the parity vector may be used as the signature-carrying residual for
FDI. We will further discuss residual generation based on parity equations in Section
III.
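To make the construction of the parity vector concrete, the sketch below computes one possible matrix $\Omega$ as a basis of the left null space of the stacked matrices $C_j(n_j)$ and evaluates the zero-input parity vector of (8). The system is a hypothetical two-state, two-sensor example with $c_1 = [1\ 0]$, $c_2 = [0\ 1]$, $n_1 = 2$ and $n_2 = 1$; the numerical entries, the initial state and the bias are assumptions, and the row-reduction routine is just one convenient way of obtaining a basis.

```python
from fractions import Fraction as F

def left_null_space(C):
    """Rows w with w C = 0, found by exact row reduction of [C | I]."""
    n, N = len(C), len(C[0])
    M = [list(map(F, row)) + [F(int(i == j)) for j in range(n)]
         for i, row in enumerate(C)]
    r = 0
    for col in range(N):
        pivot = next((i for i in range(r, n) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(n):
            if i != r and M[i][col] != 0:
                f = M[i][col] / M[r][col]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return [row[N:] for row in M[r:]]          # rows whose C-part was reduced to zero

a11, a12, a22 = F(1, 2), F(1), F(4, 5)         # illustrative values
C_STACK = [
    [1, 0],                                     # c1
    [a11, a12],                                 # c1 A
    [a11 * a11, a11 * a12 + a12 * a22],         # c1 A^2
    [0, 1],                                     # c2
    [0, a22],                                   # c2 A
]
OMEGA = left_null_space(C_STACK)                # (n - N) x n = 3 x 5

def free_response(x1, x2, steps):
    """Sensor histories y1 = x1, y2 = x2 of x(k+1) = A x(k) with zero input."""
    y1, y2 = [x1], [x2]
    for _ in range(steps):
        y1.append(a11 * y1[-1] + a12 * y2[-1])  # uses x2(k) before it is updated
        y2.append(a22 * y2[-1])
    return y1, y2

def parity_vector(y1, y2, k):
    """P(k) = Omega [Y_1(k, 2); Y_2(k, 1)] as in (8)."""
    stacked = [y1[k], y1[k + 1], y1[k + 2], y2[k], y2[k + 1]]
    return [sum(w * y for w, y in zip(row, stacked)) for row in OMEGA]

y1, y2 = free_response(F(3), F(-2), 6)
p_nofail = [parity_vector(y1, y2, k) for k in range(4)]

y2_biased = [y + F(1, 10) for y in y2]          # constant bias on sensor 2
p_fail = parity_vector(y1, y2_biased, 0)
```

In this example the parity vector is exactly zero for every window of the unfailed response, while a bias on sensor 2 leaves the first component at zero and shifts the other two. Which components respond depends on the particular basis chosen for $\Omega$, which is why the choice of $\Omega$ matters for failure identification.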


When  the  actuator  inputs  are  not  zero,  (8)  must  be  modified  to  take  into  account 
this  effect.  In  this  case 


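$$P(k) = \Omega \left\{ \begin{bmatrix} Y_1(k, n_1) \\ \vdots \\ Y_M(k, n_M) \end{bmatrix} - \begin{bmatrix} B_1(n_1) \\ \vdots \\ B_M(n_M) \end{bmatrix} U(k, n_0) \right\} \tag{9}$$

where

$$B_j(n_j) = \begin{bmatrix} 0 & 0 & \cdots & & 0 \\ c_j B & 0 & \cdots & & 0 \\ c_j A B & c_j B & 0 & \cdots & 0 \\ \vdots & & \ddots & & \vdots \\ c_j A^{n_j - 1} B & c_j A^{n_j - 2} B & \cdots & c_j B & 0 \end{bmatrix}$$

$$B = [b_1, \ldots, b_q]$$

$$u(k) = [u_1(k), \ldots, u_q(k)]'$$

$$n_0 = \max(n_1, \ldots, n_M)$$

$$U(k, n_0) = [u'(k), \ldots, u'(k + n_0)]'$$

$B_j(n_j)$ is an $(n_j + 1) \times n_0 q$ matrix ($q$ is the number of actuators). Note that Equation
(9) only involves the measurable inputs and outputs of the system, and it does not
depend on the state $x(k)$ which is not directly measured.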
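The following sketch evaluates the generalized parity vector of (9) for a hypothetical two-state system with one actuator ($q = 1$), two sensors $c_1 = [1\ 0]$, $c_2 = [0\ 1]$, and $n_1 = 2$, $n_2 = 1$. All numerical values, the input column `b`, the particular `OMEGA` and the failure model are assumptions. Each $B_j(n_j)$ is built here with one column per element of $U(k, n_0) = [u(k), \ldots, u(k+n_0)]'$, the trailing columns being zero.

```python
from fractions import Fraction as F

a11, a12, a22 = F(1, 2), F(1), F(4, 5)        # illustrative values
A = [[a11, a12], [F(0), a22]]
b = [F(0), F(1)]                               # single actuator, q = 1 (assumed)
c1, c2 = [F(1), F(0)], [F(0), F(1)]
n1, n2 = 2, 1
n0 = max(n1, n2)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def mat_vec(M, v):
    return [dot(row, v) for row in M]

def b_matrix(c, n_j):
    """B_j(n_j): entry (t, s) is c A^(t-1-s) b for s < t and zero otherwise."""
    markov, v = [], list(b)
    for _ in range(n_j):
        markov.append(dot(c, v))               # c A^i b, i = 0, 1, ...
        v = mat_vec(A, v)
    return [[markov[t - 1 - s] if s < t else F(0) for s in range(n0 + 1)]
            for t in range(n_j + 1)]

B_STACK = b_matrix(c1, n1) + b_matrix(c2, n2)  # rows match [Y_1(k,2); Y_2(k,1)]

# One choice of Omega; rows act on [y1(k), y1(k+1), y1(k+2), y2(k), y2(k+1)].
OMEGA = [
    [a11 * a22, -(a11 + a22), F(1), F(0), F(0)],
    [-a11, F(1), F(0), -a12, F(0)],
    [F(0), F(0), F(0), -a22, F(1)],
]

def simulate(x0, u_applied):
    """Sensor histories y1 = x1, y2 = x2 of x(k+1) = A x(k) + b u(k)."""
    xs = [list(x0)]
    for uk in u_applied:
        Ax = mat_vec(A, xs[-1])
        xs.append([Ax[0] + b[0] * uk, Ax[1] + b[1] * uk])
    return [x[0] for x in xs], [x[1] for x in xs]

def parity_vector(y1, y2, u_cmd, k):
    """P(k) = Omega ([Y_1; Y_2] - [B_1; B_2] U(k, n0)) as in (9)."""
    Y = [y1[k], y1[k + 1], y1[k + 2], y2[k], y2[k + 1]]
    U = u_cmd[k:k + n0 + 1]
    z = [y - dot(row, U) for y, row in zip(Y, B_STACK)]
    return [dot(w, z) for w in OMEGA]

u_cmd = [F(1), F(-2), F(3), F(0), F(1), F(2)]  # arbitrary commanded inputs
x0 = [F(3), F(-2)]

y1, y2 = simulate(x0, u_cmd)
p_nofail = [parity_vector(y1, y2, u_cmd, k) for k in range(3)]

# Actuator delivers only half of the commanded input (hypothetical failure).
y1f, y2f = simulate(x0, [uk / 2 for uk in u_cmd])
p_fail = parity_vector(y1f, y2f, u_cmd, 0)
```

In this sketch the parity vector is zero whenever the commanded and applied inputs agree. Under the assumed actuator failure, the first and third components become nonzero while the second stays at zero, because the second row of this `OMEGA` relates the two sensors without involving the input.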

The quantity $P(k)$ is known as the generalized parity vector, which is nonzero (or
non-zero mean if noise is present) only if a failure is present. The $(n - N)$-dimensional
space of all such vectors is called the generalized parity space. Under the no-fail
situation ($P(k) = 0$), (9) characterizes all the analytical redundancies for the system
(1), because it specifies all the possible relationships among the actuator inputs and
sensor outputs. Any linear combination of the rows of (9) is called a parity equation or
a parity relation; any linear combination of the right-hand side of (9) is called a parity
function. Equation (6) implies that we do not need to consider a higher dimensional
parity space that is defined by (9) with $n_j$ replaced by $l_j > n_j$, $j = 1, \ldots, M$, although it is
possible to do so. We note that the generalized parity space we have just defined here
is an extension of the parity space considered by Potter and Suman [11] to include
sensor outputs and actuator inputs at different times. In [11], Potter and Suman
studied exclusively (9) with $n_1 = n_2 = \cdots = 0$.

A  useful  notion  in  describing  analytical  redundancy  is  the  order  of  a  redundancy 
relation.  Consider  a  parity  relation  (under  the  no-fail  condition)  defined  by 

$$\sum_{j=1}^{M} \omega^j \left[ Y_j(k, n_j) - B_j(n_j) U(k, n_0) \right] = 0 \tag{10}$$

We can define the order $p$ of such a relation as follows. Since some of the elements
of $\omega$ may be zero, there is a largest index $\bar{n}$ such that the $\bar{n}$-th element of $\omega^i$ for some
$i$ is nonzero but the $(\bar{n}+1)$-st through the $(n_j+1)$-st elements of each $\omega^j$ are zero.
Then $p$ is defined to be $\bar{n} - 1$. The order $p$ describes the "memory span" of the
redundancy relation. For example, when $p = 0$, instantaneous outputs of sensors are
involved. When $p > 0$, a time window of size $p + 1$ of sensor outputs and actuator
inputs is considered in the parity equation. For example, (3) is a first order parity
relation.
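The definition of the order can be sketched in a few lines. The helper name `order` and the two coefficient sets are hypothetical; each block lists the coefficients applied to one sensor's outputs at times k, k+1, and so on.

```python
def order(omegas):
    """Order p of a parity relation given its coefficient blocks.

    Each block is a sequence [w_0, ..., w_{n_j}] for one sensor. p is the
    largest lag index t at which some block has a nonzero w_t, i.e. the
    1-based position of the last nonzero element minus one."""
    nonzero = [t for w in omegas for t, v in enumerate(w) if v != 0]
    if not nonzero:
        raise ValueError("a parity relation needs a nonzero coefficient vector")
    return max(nonzero)

# Two like sensors compared directly: y_1(k) - y_2(k) = 0.
p_direct = order([[1.0, 0.0], [-1.0, 0.0]])

# Relation (3) with T = 0.1: v(k+1) - v(k) - T a(k) = 0,
# blocks ordered as (velocity sensor, accelerometer).
p_eq3 = order([[-1.0, 1.0], [-0.1, 0.0]])
```

The first relation uses only instantaneous outputs and has order zero; the second spans two consecutive samples and has order one, in agreement with the text.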


To  provide  more  insights  into  the  nature  of  parity  relations,  it  is  useful  to  examine 
several  examples. 

1.  Direct  Redundancy 

Suppose there are $\omega^j$'s of the form

$$\omega^j = [\omega^j_0, 0, \ldots, 0]$$

where at least two of the $\omega^j_0$'s are nonzero, and they satisfy Equation (7). Then we
have the following direct redundancy relation

$$\sum_{j=1}^{M} \omega^j_0\, y_j(k) = 0$$

Note that the above expression represents a zeroth order parity equation.

2.  A  Single  Sensor 

Equation (6) implies that it is always possible to find a nonzero $\omega^j$ such that

$$\omega^j \left[ Y_j(k, n_j) - B_j(n_j) U(k, n_0) \right] = 0 \tag{11}$$

Note that Equation (11) is of order $n_j$, and it is a special case of (10). (This is why
we have used $n_j$ instead of $n_j - 1$ in (7) in order to include this type of temporal
redundancy.) Since this redundancy relation involves only one sensor, the parity
function defined by the left-hand side of (11) may be used as the residual for a
self-test for sensor $j$, if $B_j(n_j) = 0$ or if the actuators can be verified (by other
means) to be functioning properly. Similarly, it can be used to detect actuator
failures if sensor $j$ can be verified to be normal. Equation (4) (in which $v(k)$ is
directly measured) represents an example of this type.

Alternatively, (11) can be re-written as

$$y_j(k) = -(\omega^j_{n_j})^{-1} \left[ \sum_{t=1}^{n_j} \omega^j_{n_j - t}\, y_j(k - t) - \sum_{t=1}^{n_j} \sigma^j_{n_j - t}\, u(k - t) \right] \tag{12}$$

where

$$[\sigma^j_0, \ldots, \sigma^j_{n_j - 1}, 0, \ldots, 0] = \omega^j B_j(n_j)$$

$\sigma^j_t$, $t = 0, \ldots, n_j - 1$, is a $q$-dimensional row vector, and $\omega^j_t$, $t = 0, \ldots, n_j$, is the $(t+1)$-st
component of $\omega^j$. Equation (12) represents a reduced-order ARMA model for the
$j$-th sensor alone. That is to say, the output of sensor $j$ can be predicted from its
past outputs and past actuator inputs according to (12). Based on the ARMA
model several methods of residual generation are possible. We will discuss this
further in Section III.
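As a worked check of (12), consider the first-order model (4) with the velocity directly measured, so that $N = 1$, $A = a$, $B = T$, $c_j = 1$ and $n_j = n_0 = 1$. The normalization $\omega^j_1 = 1$ is one convenient choice rather than the only one, and $B_j(1)$ is written with one column per element of $U(k, n_0) = [u(k), u(k+1)]'$.

```latex
\begin{align*}
C_j(1) &= \begin{bmatrix} 1 \\ a \end{bmatrix}, \qquad
\omega^j = [\omega^j_0,\ \omega^j_1] = [-a,\ 1], \qquad
\omega^j C_j(1) = -a + a = 0, \\
B_j(1) &= \begin{bmatrix} 0 & 0 \\ T & 0 \end{bmatrix}, \qquad
\omega^j B_j(1) = [T,\ 0] \;\Rightarrow\; \sigma^j_0 = T, \\
y_j(k) &= -(\omega^j_1)^{-1}\left[\omega^j_0\, y_j(k-1) - \sigma^j_0\, u(k-1)\right]
        = a\, y_j(k-1) + T\, u(k-1).
\end{align*}
```

The last line is (4) written one step back, which is the reduced-order ARMA model for this single sensor.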

3.  Temporal  Redundancy  Between  Two  Sensors 

A temporal redundancy exists between sensor $i$ and sensor $j$ if there are

$$\omega^i = [\omega^i_0, \ldots, \omega^i_{n_i - 1}, 0]$$

$$\omega^j = [\omega^j_0, \ldots, \omega^j_{n_j - 1}, 0]$$

satisfying the redundancy relation

$$[\omega^i, \omega^j] \left\{ \begin{bmatrix} Y_i(k, n_i) \\ Y_j(k, n_j) \end{bmatrix} - \begin{bmatrix} B_i(n_i) \\ B_j(n_j) \end{bmatrix} U(k, n_0) \right\} = 0 \tag{13}$$

Equation (13) is a special case of the general form of parity equation (10) in the
no-fail situation with $\omega^s = 0$ for $s \ne i$, $s \ne j$. The relation (13) is of order
$p < \max(n_i, n_j)$. Clearly, (13) holds if and only if

$$[\omega^i_0, \ldots, \omega^i_{n_i - 1}]\, C_i(n_i - 1) = -[\omega^j_0, \ldots, \omega^j_{n_j - 1}]\, C_j(n_j - 1) \tag{14}$$

and (14) implies that a redundancy relation exists between two sensors if their
observable subspaces overlap. Furthermore, when the overlap subspace is of
dimension $n$, there are $n$ linearly independent vectors of the form $[\omega^i, \omega^j]$ that will
satisfy (13). Note that (3) (with both $v(k)$ and $a(k)$ measured) represents an
example of this type.

Because the order of (13) is $p$, either $\omega^i_p$ or $\omega^j_p$ must be nonzero. Assuming $\omega^j_p \ne 0$,
we can re-write (13) in an ARMA representation for sensor $j$ as in (12):

$$y_j(k) = -(\omega^j_p)^{-1} \left[ \sum_{t=1}^{p} \omega^j_{p - t}\, y_j(k - t) + \sum_{t=0}^{p} \omega^i_{p - t}\, y_i(k - t) - \sum_{t=0}^{p} (\sigma^i_{p - t} + \sigma^j_{p - t})\, u(k - t) \right]$$

That is, the parity relation (13) specifies an ARMA model for the $j$-th sensor, with
the original system input $u$ and the $i$-th sensor output acting as inputs to this
reduced order model. In general, any parity relation specifies an ARMA model for
some sensor driven by $u$ and by possibly all of the other sensor outputs.


III. RESIDUAL GENERATION FOR FDI


In  the  first  part  of  this  section  we  discuss  alternative  residual  generation 
procedures,  and  in  the  latter  half  of  the  section  we  discuss  how  such  residuals,  once 
generated, can be used for failure detection. Our development in this section
will be carried out in terms of a second order system ($N = 2$) in the form of (1) with
the following parameters.

$$A = \begin{bmatrix} a_{11} & a_{12} \\ 0 & a_{22} \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad c_1 = [1 \;\; 0], \quad c_2 = [0 \;\; 1] \tag{15}$$

In this case $n_1 = 2$, $n_2 = 1$, and $n - N = 3$. Therefore, there are only three linearly
independent parity equations which may be written as


$$\begin{aligned} y_1(k) - (a_{11} + a_{22})\, y_1(k-1) + a_{11} a_{22}\, y_1(k-2) - a_{12}\, u(k-2) &= 0 \\ y_1(k) - a_{11}\, y_1(k-1) - a_{12}\, y_2(k-1) &= 0 \\ y_2(k) - a_{22}\, y_2(k-1) - u(k-1) &= 0 \end{aligned} \tag{16}$$

Note that these represent temporal redundancies.

Residual Generation Based on Parity Relations


For  a  zeroth  order  parity  relation  (i.e.  a  direct  redundancy  relation)  the  residual  is 
the  corresponding  parity  function.  For  a  higher  order  parity  relation  (temporal 
redundancy),  there  are  three  possible  methods  for  the  residual  generation.  We  will 
illustrate  these  using  the  second  parity  equation  of  (16). 


1. Parity Function as Residuals

Just  as  with  direct  redundancy  relations,  the  parity  function  itself  can  be  taken 
as  a  residual.  For  our  specific  example,  this  would  be 

$$r_1(k) = y_1(k) - a_{11}\, y_1(k-1) - a_{12}\, y_2(k-1) \tag{17}$$

Such a residual is a moving average process, i.e. it is a function of a sliding window
of the most recent sensor output and (possibly) actuator input values. It is useful
to note the effect of noise and failures on the residual. Specifically, if the sensor
outputs are corrupted by white noise, the parity function values will be correlated
over the length of the window. In our example, $r_1(k)$ is correlated with $r_1(k-1)$
and $r_1(k+1)$ but not with any of its values removed by more than one time step.

The effect of a failure on a parity function depends, of course, on the nature of
the failure. To illustrate what typically occurs, consider the case in which one
sensor develops a bias. Since the parity function is a moving average process it
also develops a bias, taking at most $p$ steps to reach the steady state value. In our
example, if $y_2(k)$ develops a bias of size $\beta$ at time $\theta$, $r_1(k)$ will have a bias of size
$-a_{12}\beta$ from time $\theta + 1$ on.
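The bias response of the parity-function residual can be reproduced with a short noise-free simulation. The parameter values, initial state, bias size and onset time below are assumptions; the input is taken to be zero, which does not affect this particular residual since it involves no input term.

```python
from fractions import Fraction as F

a11, a12, a22 = F(1, 2), F(1), F(4, 5)    # illustrative values
beta, theta = F(1, 10), 4                  # assumed bias size and onset time

# Noise-free, zero-input state history; the sensors measure x1 and x2 directly.
x1, x2 = [F(3)], [F(-2)]
for _ in range(10):
    x1.append(a11 * x1[-1] + a12 * x2[-1])   # uses x2(k) before it is updated
    x2.append(a22 * x2[-1])

y1 = x1
y2 = [v + (beta if k >= theta else 0) for k, v in enumerate(x2)]  # biased sensor 2

# Parity-function residual: y1(k) - a11 y1(k-1) - a12 y2(k-1), defined for k >= 1.
r1 = {k: y1[k] - a11 * y1[k - 1] - a12 * y2[k - 1] for k in range(1, 11)}
```

With these values the residual is zero up to and including time 4 and equals the constant -a12 * beta = -1/10 from time 5 on, so the failure signature is a step that appears one sample after the bias.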

2.  Open-Loop  Residuals 


As  discussed  in  the  preceding  section,  any  temporal  redundancy  relation 
specifies  an  ARMA  model.  In  our  example  we  have  the  model 


$$y_1(k) = a_{11}\, y_1(k-1) + a_{12}\, y_2(k-1) \tag{18}$$

This equation leads naturally to a second residual generation procedure: solve
equation (18) recursively using as initial condition the actual initial value of the
first sensor output and then using the actual value of the second sensor in the
recursion; compare the result at each instant of time to the actual output of sensor
1. That is, we compute

$$\hat{y}_1(k) = a_{11}\, \hat{y}_1(k-1) + a_{12}\, y_2(k-1) \tag{19}$$

with

$$\hat{y}_1(0) = y_1(0)$$

and the resulting residual is

$$r_2(k) = y_1(k) - \hat{y}_1(k)$$

The behavior of this residual is decidedly different from that of $r_1(k)$. In
particular, $r_2(k)$ is not a moving average of previous values as it involves the
integration of $y_2(k)$. Thus, if $y_1(k)$ and $y_2(k)$ are corrupted by white noise, $r_2(k)$
will in general be correlated with all of its preceding and future values. For
example, if $a_{11} = 1$, $r_2(k)$ is nothing but a random walk.

The effect of failure is also different in $r_2(k)$. For example, if $y_2(k)$ develops a
bias, this bias will be integrated in (19). In particular, if $a_{11} = 1$, $r_2(k)$ will develop
a ramp of slope $-a_{12}\beta$ at time $\theta + 1$ if sensor 2 develops a bias of size $\beta$ at
time $\theta$.
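A noise-free sketch of the open-loop residual follows, with the first diagonal entry of $A$ set to one so that the recursion acts as a pure integrator. The remaining parameter values, the initial state, the bias size and the onset time are assumptions.

```python
from fractions import Fraction as F

a11, a12, a22 = F(1), F(1, 2), F(4, 5)    # a11 = 1 makes the recursion an integrator
beta, theta = F(1, 10), 4                  # assumed bias size and onset time

# Noise-free, zero-input state history; the sensors measure x1 and x2 directly.
x1, x2 = [F(3)], [F(-2)]
for _ in range(10):
    x1.append(a11 * x1[-1] + a12 * x2[-1])   # uses x2(k) before it is updated
    x2.append(a22 * x2[-1])

y1 = x1
y2 = [v + (beta if k >= theta else 0) for k, v in enumerate(x2)]  # biased sensor 2

# Open-loop prediction of sensor 1 driven by the measured sensor 2 output.
y1_hat = [y1[0]]                            # initial condition: actual first output
for k in range(1, len(y1)):
    y1_hat.append(a11 * y1_hat[-1] + a12 * y2[k - 1])

r2 = [y1[k] - y1_hat[k] for k in range(len(y1))]
```

Here the residual is zero through time 4 and then decreases by a12 * beta = 1/20 at every step, i.e. a ramp rather than the step seen in the parity-function residual. The same integration that produces the ramp also accumulates noise and model error, which is the price of the open-loop method.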

3.  Closed-Loop  Residuals 

A third method of residual generation is also based on the ARMA model (18),
but explicitly taking noise into account. Specifically, we write each sensor output
as its noise-free value plus noise:

$$y_j(k) = y_{j0}(k) + v_j(k) \tag{20}$$

Then, from (18) we obtain the equation

$$y_{10}(k) = a_{11}\, y_{10}(k-1) + a_{12}\, y_2(k-1) - a_{12}\, v_2(k-1) \tag{21}$$

Note that the known driving term here is the actual sensor output, and thus the
noise on this output becomes a driving noise for the model (21). Given this
model and the noisy measurement $y_1(k)$ of $y_{10}(k)$ we can design a Kalman filter

$$\hat{y}_{10}(k) = a_{11}\, \hat{y}_{10}(k-1) + a_{12}\, y_2(k-1) + H\, r_3(k)$$

where $H$ is the Kalman gain and the residual is the innovations

$$r_3(k) = y_1(k) - a_{11}\, \hat{y}_{10}(k-1) - a_{12}\, y_2(k-1)$$

In this case, $r_3(k)$ is an uncorrelated sequence. Also, if $y_2(k)$ develops a bias at
time $\theta$, the trend in $r_3(k)$ will be time-varying. Specifically, it will begin at time
$\theta + 1$ as a ramp, but will level off to a steady state bias due to the closed-loop
nature of the residual generation process.
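The levelling-off of the closed-loop residual can be seen in a noise-free sketch. The gain `H` below is a fixed illustrative constant, not a Kalman gain computed from noise statistics; the parameter values, initial state, bias size and onset time are likewise assumptions.

```python
from fractions import Fraction as F

a11, a12, a22 = F(1), F(1, 2), F(4, 5)    # illustrative values
beta, theta = F(1, 10), 4                  # assumed bias size and onset time
H = F(1, 2)                                # fixed illustrative gain (assumption)

# Noise-free, zero-input state history; the sensors measure x1 and x2 directly.
x1, x2 = [F(3)], [F(-2)]
for _ in range(10):
    x1.append(a11 * x1[-1] + a12 * x2[-1])   # uses x2(k) before it is updated
    x2.append(a22 * x2[-1])

y1 = x1
y2 = [v + (beta if k >= theta else 0) for k, v in enumerate(x2)]  # biased sensor 2

# Closed-loop filter: predict, form the innovation r3, correct with gain H.
y_hat = [y1[0]]
r3 = [F(0)]
for k in range(1, len(y1)):
    pred = a11 * y_hat[-1] + a12 * y2[k - 1]
    r3.append(y1[k] - pred)
    y_hat.append(pred + H * r3[-1])
```

With these values the residual satisfies r3(theta + m) = -(1/10)(1 - 2^(-m)): it starts at -1/20 at time theta + 1, initially grows like the open-loop ramp, and then settles at -a12 * beta / (1 - a11 (1 - H)) = -1/10. A larger gain makes it settle faster at a smaller steady-state bias.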

All  three  of  these  residual  generation  procedures  have  been  used  in  practice.  For 
example,  parity  functions  have  found  many  applications,  ranging  from  gyro  failure 
detection  [7,8]  to  the  validation  of  signals  in  nuclear  plants  [13].  The  open-loop 
method  was  used  in  detecting  sensor  failures  on  the  F-8  aircraft  [10],  as  was  the 
closed-loop  method,  which  has  also  been  used  in  such  applications  as 
electrocardiogram analysis [6] and maneuver detection [15]. Our contribution here is
to  expose  the  fundamental  relationships  among  these  in  general. 
At first glance, it might seem that the closed-loop method is the logical method to
use in that, if the sensor noise is white, it produces an uncorrelated sequence of
residuals rather than a correlated one that would have to be whitened in an "optimal"
detection system. In fact, going one step further, it would seem decidedly suboptimal
to use only one or several redundancy relations rather than all of them. That is, the
"optimal" approach would seem to be designing a Kalman filter based on the entire
model  (1).  This,  however,  is  true  only  in  the  most  ideal  of  worlds  in  which  our 
knowledge  of  the  system  dynamics  is  perfect.  When  model  uncertainties  are  taken 
into  account  it  is  not  at  all  clear  that  this  is  what  one  should  do.  Rather,  it  would 
seem  reasonable  to  identify  only  the  most  robust  redundancy  relations  and  then  to 
structure  failure  detection  systems  based  on  these.  This  leads  to  two  obvious 
questions: 

1. How does one define and determine robust redundancy relations?

2. Given a set of such relations, how does one use them in concert in designing a
failure detection system?

In  the  remainder  of  this  section  we  discuss  the  second  of  these  questions,  while  the 
first  is  addressed  in  the  next  section.  Throughout  these  developments  we  will  focus 
on  using  the  first  (i.e.  the  parity  function)  method  of  residual  generation,  as  this  is  the 
simplest  analytically  while  allowing  us  to  gain  considerable  insight  and  develop  some 
very  useful  techniques  for  robust  failure  detection. 

Use  of  Parity  Functions  In  a  Failure  Detection  System 

Now we discuss how the residuals generated using parity functions can be used for
failure detection. In this discussion we will not be concerned with the detailed
decision process, which would involve specific statistical tests, but we will focus on the
geometry of the failure detection problem. First we will examine (sensor) FDI using
direct redundancy. This is the case that has been examined in most detail in the
literature, for example, in the work of Evans and Wilcox [7], Gilmore and McKern
[8], Potter and Suman [11], Daley et al. [12], and Desai and Ray [13]. We include
this brief discussion of concepts developed by others in order to provide a basis for
our discussion of their extension to include temporal redundancy relations.

Consider a set of M sensors with output vector $y(k) = [y_1(k), \ldots, y_M(k)]'$ and a parity vector

$$P(k) = \Omega\, y(k) \qquad (22)$$

where $\Omega$ is a matrix with M columns and a number of rows (the specification of which will be discussed later). From Section II, we see that $\Omega$ is not unique, and for any choice of $\Omega$ such that (22) is a parity vector, we know that P(k) will be zero in the absence of a failure (and no noise). However, the nature of failure signatures contained in the parity vector depends heavily on the choice of $\Omega$. Clearly $\Omega$ should be chosen so that failure signatures are easily recognizable. In the following we will describe two approaches for achieving this purpose.

One  way  of  using  the  parity  vector  for  FDI  is  via  what  we  term  a  voting  scheme.  To 
implement  the  voting  scheme,  we  need  a  set  of  parity  relations  such  that  each 
component  (i.e.  sensor  or  actuator)  of  interest  is  included  in  at  least  one  parity 
relation  and  each  component  is  excluded  from  at  least  one  parity  relation.  When  a 
component fails, all the parity relations involving it will be violated¹, while those excluding it will still hold. This means that the components involved in parity relations that hold can be immediately declared as unfailed, while the component that is common to all violated parity relations is readily identified as failed. This is the basic idea of voting that is used in [7,8]. In fact, for the detection and identification of a single failure among M components at least M-1 parity relations are required². Therefore, the number of rows in $\Omega$ should be at least M-1, and the rows of $\Omega$ should be chosen to satisfy the above criterion on the set of parity relations. Furthermore, we note that at least three components are needed for voting and that it may not be possible to determine a required $\Omega$ in many applications, in which case the use of temporal redundancy is absolutely necessary.
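
The voting logic described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `vote`, the incidence table `INVOLVES`, and the three-sensor layout are hypothetical and not taken from the paper, and the thresholding that decides whether a relation is "violated" is assumed to have been done already.

```python
from typing import List, Optional, Sequence


def vote(involves: Sequence[Sequence[bool]], violated: Sequence[bool]) -> Optional[int]:
    """Return the index of the single failed component, or None.

    involves[r][c] is True when parity relation r involves component c.
    violated[r] is True when relation r has exceeded its threshold.
    """
    if not any(violated):
        return None  # every relation holds: no failure is declared
    n_components = len(involves[0])
    suspects: List[int] = []
    for c in range(n_components):
        # A failed component appears in every violated relation and in
        # no relation that still holds.
        if all(involves[r][c] == violated[r] for r in range(len(violated))):
            suspects.append(c)
    # Declare a failure only when exactly one component explains the pattern.
    return suspects[0] if len(suspects) == 1 else None


# Hypothetical layout: three sensors and three pairwise parity relations.
INVOLVES = [
    [True, True, False],   # relation 0 compares sensors 0 and 1
    [True, False, True],   # relation 1 compares sensors 0 and 2
    [False, True, True],   # relation 2 compares sensors 1 and 2
]
```

For example, `vote(INVOLVES, [True, True, False])` points to sensor 0, the only component common to both violated relations and absent from the one that holds.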

Another method which uses more information about how failures affect the residuals has been examined by Potter and Suman [11], and Daley, et al. [12]. This method exploits the following phenomenon. A faulty sensor, say the j-th one, contains an error signal $\nu(k)$ in its output

$$y_j(k) = c_j\, x(k) + \nu(k) \qquad (23)$$

The effect of this failure on the parity vector defined by (22) is

$$P(k) = \Omega_j\, \nu(k)$$

where $\Omega_j$ is the j-th column of $\Omega$. That is, no matter what $\nu(k)$ is, the effect of a sensor j failure on the residual always lies in the direction $\Omega_j$. Thus, a sensor j failure can be identified by recognizing a residual bias in the $\Omega_j$ direction. We refer to $\Omega_j$ as the Failure Direction in Parity Space (FDPS) corresponding to sensor j. (In [11] $\Omega_j$ is referred to as the j-th measurement axis in parity space.)

1. "Violation" can be defined in a variety of ways. Typically, one compares the residual value to a threshold determined by some means (e.g. one may use a statistical criterion to set the threshold to achieve a specified false alarm-correct detection tradeoff). Alternatively, one may use the average of the residual over a sliding window to improve the tradeoff.

2. The logic used here has to be modified slightly. If each of the M-1 components is excluded from a different parity relation and the remaining component is involved in all parity relations, then violation of all parity relations indicates the failure of this last component, and failures in the other components can be identified using the above logic. In practice, more than M-1 relations are preferred for better performance in noise.

It is now clear that $\Omega$ should be chosen to have distinct columns, so that a sensor failure can be inferred from the presence of a residual bias in its corresponding FDPS. (Note that an $\Omega$ suitable for the voting scheme has M distinct columns.) In principle, an $\Omega$ with as few as two rows but M distinct columns is sufficient for detecting and identifying a failure among the M sensors. In practice, however, increasing the row dimension of $\Omega$ can help to separate the various FDPS's and increase the distinguishability of the different failures under noisy conditions.
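
As a rough illustration of identification by failure direction, the sketch below picks the sensor whose column of the parity matrix is best aligned with an observed residual. The matrix `OMEGA` is a hypothetical two-row choice for three sensors that all measure the same scalar quantity, and the function names are assumptions made for this example; noise handling and thresholds are left out.

```python
import math
from typing import List, Sequence

# Hypothetical parity matrix: each row sums to zero, so the parity vector
# vanishes when all three sensors report the same value.
OMEGA = [
    [1.0, -1.0, 0.0],
    [1.0, 1.0, -2.0],
]


def parity_vector(omega: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Compute the parity vector as the matrix-vector product omega * y."""
    return [sum(w * v for w, v in zip(row, y)) for row in omega]


def identify_by_direction(omega: Sequence[Sequence[float]], residual: Sequence[float]) -> int:
    """Return the sensor whose failure direction best aligns with the residual."""
    best_j, best_score = -1, -1.0
    res_norm = math.sqrt(sum(r * r for r in residual))
    for j in range(len(omega[0])):
        col = [row[j] for row in omega]
        dot = sum(c * r for c, r in zip(col, residual))
        norm = math.sqrt(sum(c * c for c in col)) * res_norm
        score = abs(dot) / norm if norm > 0 else 0.0  # |cosine| of the angle
        if score > best_score:
            best_j, best_score = j, score
    return best_j
```

With a bias on the second sensor, for instance `y = [2.0, 2.5, 2.0]`, the parity vector is a multiple of the second column of `OMEGA`, and `identify_by_direction` returns index 1. The absolute value of the cosine is used so that the sign of the bias does not matter.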

The two FDI methods discussed above can also be used with temporal redundancy. In a voting scheme, one can see that the same logic applies. (In fact, additional self-tests may be performed for the sensors, providing corroboratory information which is of great value when noise is present.) Consider next the extension of the second failure detection method to temporal redundancy relations. In this case, it is generally not possible to find an $\Omega$ to confine the effect of each component failure to a fixed direction in parity space. To see this, consider the parity relations (16). We can write the parity vector as

$$
P(k) =
\begin{bmatrix}
1 & -(a_{11}+a_{22}) & a_{11}a_{22} & 0 & 0 \\
1 & -a_{11} & 0 & 0 & -a_{12} \\
0 & 0 & 0 & 1 & -a_{22}
\end{bmatrix}
\begin{bmatrix}
y_1(k) \\ y_1(k-1) \\ y_1(k-2) \\ y_2(k) \\ y_2(k-1)
\end{bmatrix}
+
\begin{bmatrix}
0 & -a_{12} \\
0 & 0 \\
-1 & 0
\end{bmatrix}
\begin{bmatrix}
u(k-1) \\ u(k-2)
\end{bmatrix}
$$

When sensor 2 fails (with output model (23)), the residual vector develops a bias of the form

$$
P(k) =
\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \nu(k)
+
\begin{bmatrix} 0 \\ -a_{12} \\ -a_{22} \end{bmatrix} \nu(k-1)
\qquad (24)
$$

Unless $\nu(k)$ is a constant, the effect (signature) of a sensor 2 failure is only confined to a two-dimensional subspace of the parity space. In fact, when temporal redundancy is used in the parity function method for residual generation, failure signatures are generally constrained to multi-dimensional subspaces in the parity space. These subspaces may in general overlap with one another, or some may be contained in others. If no such subspace is contained in another, identification of the failure is still possible by determining which subspace the residual bias lies in. We note that the detection filters of Beard [2] and Jones [3] effectively act, in a closed-loop fashion, to confine the signature of an actuator failure to a single direction and that of a sensor failure to a two-dimensional subspace in the residual space.
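
The two-dimensional signature of a sensor 2 failure can be checked numerically. The sketch below assumes a two-state system with $x_1(k+1) = a_{11}x_1(k) + a_{12}x_2(k)$, $x_2(k+1) = a_{22}x_2(k) + u(k)$ and both states measured directly, which is a structure consistent with the three temporal parity relations used above; the parameter values, input sequence, and function names are hypothetical.

```python
from typing import List, Tuple

A11, A12, A22 = 0.5, 0.3, 0.8             # hypothetical parameter values
U = [1.0, 0.0, 2.0, -1.0, 0.5, 1.0]       # hypothetical input sequence


def simulate(u: List[float], bias: List[float]) -> Tuple[List[float], List[float]]:
    """Simulate the assumed system; bias[k] is added to sensor 2 at time k."""
    x1, x2 = 1.0, -0.5
    y1: List[float] = []
    y2: List[float] = []
    for k in range(len(u)):
        y1.append(x1)
        y2.append(x2 + bias[k])            # the failure corrupts the output only
        x1, x2 = A11 * x1 + A12 * x2, A22 * x2 + u[k]
    return y1, y2


def parity_vector(y1: List[float], y2: List[float], u: List[float], k: int) -> List[float]:
    """Evaluate the three temporal parity functions at time k (k >= 2)."""
    p1 = y1[k] - (A11 + A22) * y1[k - 1] + A11 * A22 * y1[k - 2] - A12 * u[k - 2]
    p2 = y1[k] - A11 * y1[k - 1] - A12 * y2[k - 1]
    p3 = y2[k] - A22 * y2[k - 1] - u[k - 1]
    return [p1, p2, p3]


def predicted_signature(bias: List[float], k: int) -> List[float]:
    """Bias predicted by the sensor 2 signature: one direction per time lag."""
    return [0.0, -A12 * bias[k - 1], bias[k] - A22 * bias[k - 1]]
```

Without a failure the parity vector is zero up to rounding. With a time-varying bias on sensor 2, it matches `predicted_signature`, which spans two directions rather than one because both the current and the previous value of the bias enter.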

As we indicated previously, the second approach to using parity functions for FDI uses some information about the nature of the failure signatures. Specifically, it uses information concerning the subspaces in which the signatures evolve. In this approach no attempt is made to use any information concerning the temporal structure of this evolution. (For example, no assumption was made about the evolution of $\nu(k)$ in (24).) In some problems (e.g. in [6,10]) one may be able to model the evolution of failures as a function of time. In this case, the temporal signature of the failure (in addition to the subspace information discussed above) can be determined. (If, for instance, $\nu(k)$ in (23) is modelled in a particular way, then one immediately obtains a model of the evolution of P(k) in (24).) Such information can be of further help in distinguishing the various failures, especially in the case where temporal redundancy is used. Detection systems such as GLR [4,5,6] heavily exploit such information contained in the residual.

IV.  PARITY  RELATIONS  FOR  ROBUST  RESIDUAL  GENERATION

In this section we discuss the issue of robust failure detection in terms of the notions introduced in the previous section. The need for this development comes from the obvious fact that in any application a deterministic model such as (1) is quite idealistic. In particular, the true system will be subjected to noise and parameter uncertainty. If noise alone were present one could take this into account, as we have indicated, through the design of a statistical test based on the generated residuals (see, for example, [4,10]). However, the question of developing a methodology for FDI that also takes parameter uncertainty into account has not been treated in the literature previously. It is this problem we address here.

The starting point of our development is a model that has the same form as (1) but includes noise disturbance and parameter uncertainty:

$$x(k+1) = A(\gamma)\, x(k) + \sum_j b_j(\gamma)\, u_j(k) + \xi(k) \qquad (25a)$$

$$y_j(k) = c_j\, x(k) + \eta_j(k) \qquad (25b)$$

where $\gamma$ is the vector of uncertain parameters taking values in a specified subset $\Gamma$ of $R^m$. This form allows the modelling of elements in the system matrices as uncertain quantities that may be functions of a common quantity. The vectors $\xi$ and $\eta = [\eta_1, \ldots, \eta_M]'$ are independent, zero-mean, white Gaussian noise vectors with constant covariance matrices $Q$ ($\geq 0$) and $R$ ($> 0$), respectively. In this section we consider the problem of determining useful parity relations that can be used for FDI for the system described by (25).

The  Structure  and  Coefficients  of  a  Parity  Function


Before we continue with the discussion, it is useful to define the structure and the coefficients of a parity function. Recall that a parity function is essentially a weighted combination of a (time) window of sensor outputs and actuator inputs. The structure of a parity function defines which input and output elements are included in this window, and the coefficients are the (nonzero) weights corresponding to these elements. A scalar parity function, p(k), can be written as

$$p(k) = \alpha\, Y(k) + \beta\, U(k) \qquad (26)$$

where Y(k) and U(k) denote the vectors containing the output and input elements in the parity function, respectively. Together, Y(k) and U(k) specify the parity structure, and the row vectors $\alpha$ and $\beta$ contain the parity coefficients. Consider, for example, the first parity function of (16). Its corresponding Y(k), U(k), $\alpha$, and $\beta$ are

$$Y(k) = [\,y_1(k-2),\ y_1(k-1),\ y_1(k)\,]'$$

$$U(k) = u(k-2)$$

$$\alpha = [\,a_{11}a_{22},\ -(a_{11}+a_{22}),\ 1\,]$$

$$\beta = -a_{12}$$

Under model (25), Y(k) has the form

$$Y(k) = C(\gamma)\, x(k-p) + \Phi(\gamma)\, \bar\xi(k) + B(\gamma)\, U(k) + \bar\eta(k) \qquad (27)$$

where p is the order of the parity function, and

$$\bar\xi(k) = [\,\xi'(k-p),\ \ldots,\ \xi'(k-1)\,]'$$

The components of $\bar\eta(k)$ and U(k), and the rows of $C(\gamma)$, $\Phi(\gamma)$, and $B$ are determined from (25) and the structure of Y(k). If, specifically, the i-th component of Y(k) is $y_s(k-\sigma)$, then the i-th component of $\bar\eta(k)$ is

$$\bar\eta_i(k) = \eta_s(k-\sigma)$$

The vectors $\bar\xi$ and $\bar\eta$ are independent zero-mean Gaussian random sequences with constant covariances $\bar Q$ and $\bar R$, respectively. The matrix $\bar Q$ is block diagonal with $Q$ on the diagonal; $\bar R_{ij} = R_{st}\,\delta_{\sigma\tau}$, where $\bar R_{ij}$ is the (i,j)-th element of $\bar R$, $\delta_{\sigma\tau}$ is the Kronecker delta function, $R_{st}$ is the (s,t)-th element of $R$, and the i-th element of Y(k) is $y_s(k-\sigma)$, while the j-th element is $y_t(k-\tau)$. The i-th row of $C(\gamma)$, i.e. $C(i,\gamma)$, is

$$C(i,\gamma) = c_s A^{p-\sigma}$$

The i-th row, $\Phi(i,\gamma)$, of $\Phi(\gamma)$ (which has pN columns) is

$$\Phi(i,\gamma) = [\,c_s A^{p-\sigma-1},\ c_s A^{p-\sigma-2},\ \ldots,\ c_s,\ 0,\ \ldots,\ 0\,]$$

Note that x(k-p) is a random vector that is uncorrelated with $\bar\xi$ and $\bar\eta$, and

$$E\{x(k-p)\} = x_0(k-p)$$

$$\mathrm{cov}\{x(k-p)\} = \Sigma(\gamma)$$

where $\Sigma(\gamma)$ is the (steady state) covariance of x(k-p) and it is dependent on $\gamma$ through $A(\gamma)$ and $B(\gamma)$.

The matrix $B$ and the vector U(k) are determined as follows. First, collect into a matrix $\tilde B$ all the rows in $B_j(k,\gamma)$ (see Equation (9)) corresponding to $C(i,\gamma)$. Then, collect all the non-zero columns of $\tilde B$ into $B$ and the corresponding components of u in the window into U(k).

In the preceding section, we defined parity functions as linear combinations of inputs and outputs that would be identically zero in the absence of noise. When parameter uncertainties are included, however, it is not possible in general to find any parity functions in this narrow sense. In particular, with reference to the function p(k) defined by (26) and (27) this condition would require that $\alpha C(\gamma) = 0$ for all $\gamma \in \Gamma$. Consequently, we must modify our notion of a useful parity relation. Intuitively, any given parity structure will be useful for failure detection if we can find a set of parity coefficients that will make the resulting function p(k) in (26) close to zero for all values of $\gamma \in \Gamma$ when no failure has occurred. When considering the use of such a function for the detection of a particular failure one would also want to guarantee that p(k) deviates significantly from zero for all $\gamma \in \Gamma$ when this failure occurs. Such a parity structure-coefficient combination approximates the true parity function defined in Section II. Our approach to the robustness issue is founded on this perspective of the FDI design problem, and we will choose parity structures and coefficients that display these properties. From this vantage point, it is not necessary to base a parity structure on a C with linearly dependent rows. Of course, the closer the rows of C are to being dependent the less the value of the state x(k-p) will affect the value of the approximate parity function, i.e. the closer the approximate parity function is to being a true parity function.

Determination  of  Parity  Structure  and  Coefficients

Clearly, there are many candidate parity structures for a given system. For a voting system, the requirements on $\Omega$ as described in Section III help to limit the number of such candidates that must be considered. In addition, special features of the system under consideration typically provide additional insights into the choice of candidate parity structures. Given the set of candidate structures one is faced with the problem of finding the best coefficients for each and then with comparing the resulting candidates. In this paper we will not address the problem of defining the set of candidate structures (as this is very much a system-specific question) but will assume that we have such a set of structures*, and we will proceed to consider the problem of determining the coefficients for these structures and their comparison. In the following we will describe a method for choosing robust parity functions. Although this approach represents only one method of solving the problem, it serves well to illustrate the basic ideas of a useful design methodology.

The parity function design problem is approached in two steps: 1) coefficients that will make the candidate parity functions close to zero under the no-fail situation are determined, 2) the resulting parity functions that provide the most prominent failure signatures for a specified failure will be chosen. We will consider the coefficient design problem first.

* This set could be all structures up to a specified order, which is a finite set.

We are concerned with the choice of the coefficients, $\alpha$ and $\beta$, for the parity function

$$p(k) = \alpha\,[\,C(\gamma)\,x(k-p) + \Phi(\gamma)\,\bar\xi(k) + B(\gamma)\,U(k) + \bar\eta(k)\,] + \beta\,U(k)$$

Note the dependence of p(k) on $\alpha$, $\beta$, $\gamma$, x(k-p), and U(k). As p(k) is a random variable, a convenient measure of the magnitude (squared) of p(k) is its variance, $E\{p^2(k)\}$, where the expectation is taken with respect to the joint probability density of x(k-p), $\bar\xi(k)$, and $\bar\eta(k)$, with the mean $x_0(k-p)$ and the value of U(k) assumed known. As we will discuss shortly, this can be thought of as specifying a particular operating condition for the system. Note also that the statistics of x(k-p) depend on $\gamma$. Define

$$e(\alpha,\beta) = \max_{\gamma \in \Gamma} E\{p^2(k)\} \qquad (28)$$

The quantity $e(\alpha,\beta)$ represents the worst case effect of noise and model uncertainty on the parity function p(k) and is called the parity error for p(k) with the coefficients $\alpha$ and $\beta$. We can attempt to achieve a conservative choice of the parity coefficients by solving

$$\min_{\alpha,\beta}\ e(\alpha,\beta)$$

Since it has a trivial solution ($\alpha = 0$, $\beta = 0$) this optimization problem has to be modified in order to give a meaningful solution. Recall that a parity equation primarily relates the sensor outputs, i.e. a parity equation always includes output terms but not necessarily input terms. Therefore, $\alpha$ must be nonzero. Without loss of generality, we can restrict $\alpha$ to have unit magnitude. The actuator input terms in a parity relation may be regarded as serving to make the parity function zero so that $\beta$ is nominally free. In fact, $\beta$ has only a single degree of freedom. Any $\beta$ can be written as $\beta = \lambda U'(k) + z'$, where z is a (column) vector orthogonal to U(k). The component z' in $\beta$ will not produce any effect on p(k). This implies that for each U(k) we only have to consider $\beta$ of the form $\beta = \lambda U'(k)$, and we have the following minimax problem

$$\min_{\substack{\alpha,\,\lambda \\ \alpha\alpha' = 1}}\ \max_{\gamma \in \Gamma}\ E\{p^2(k)\} \qquad (29)$$

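where

$$E\{p^2(k)\} = [\alpha,\ \lambda]\ S\ [\alpha,\ \lambda]'$$

and S is the symmetric positive-definite matrix

$$S = \begin{bmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{bmatrix}$$

$$
\begin{aligned}
S_{11} ={}& C(\gamma)\,[\,x_0(k-p)\,x_0'(k-p) + \Sigma(\gamma)\,]\,C'(\gamma) + \Phi(\gamma)\,\bar Q\,\Phi'(\gamma) + \bar R \\
& + B(\gamma)\,U(k)\,U'(k)\,B'(\gamma) + C(\gamma)\,x_0(k-p)\,U'(k)\,B'(\gamma) + B(\gamma)\,U(k)\,x_0'(k-p)\,C'(\gamma)
\end{aligned}
$$

$$S_{12} = S_{21}' = [\,U'(k)\,U(k)\,]\,[\,B(\gamma)\,U(k) + C(\gamma)\,x_0(k-p)\,]$$

$$S_{22} = [\,U'(k)\,U(k)\,]^2$$

Let $\alpha^*$ and $\lambda^*$ denote the values of $\alpha$ and $\lambda$ that solve (29), with $\beta^* = \lambda^* U'(k)$. Let $e^*$ be the minimax parity error of (29), i.e. $e^* = e(\alpha^*, \beta^*)$. Then $e^*$ is the parity error corresponding to the parity function $p^*(k) = \alpha^* Y(k) + \beta^* U(k)$. The quantity $e^*$ measures the usefulness of $p^*(k)$ as a parity function around the operating point specified by $x_0(k-p)$ and U(k).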
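
To make the minimax idea concrete, the sketch below solves a deliberately small instance by brute force. It assumes a scalar system $x(k+1) = \gamma x(k) + \xi(k)$, $y(k) = x(k) + \eta(k)$ with no input, the parity structure $Y(k) = [y(k-1),\ y(k)]'$, a finite grid of values for $\gamma$, and a grid search over unit-magnitude coefficient vectors. All numerical values and function names are hypothetical, and the grid search stands in for whatever solution procedure is actually used.

```python
import math
from typing import List, Sequence, Tuple

GAMMAS = [0.80, 0.85, 0.90]            # hypothetical grid over the uncertain parameter
X0, SIGMA, Q, R = 1.0, 0.1, 0.01, 0.01  # hypothetical operating point and noise levels


def s_matrix(gamma: float) -> List[List[float]]:
    """Second-moment matrix of Y(k) for the assumed scalar model (no inputs)."""
    c = [1.0, gamma]        # y(k-1) and y(k) expressed through x(k-1)
    phi = [0.0, 1.0]        # the plant noise xi(k-1) enters only y(k)
    m = X0 * X0 + SIGMA     # second moment of x(k-1)
    return [[c[i] * m * c[j] + phi[i] * Q * phi[j] + (R if i == j else 0.0)
             for j in range(2)] for i in range(2)]


def parity_error(alpha: Sequence[float], gammas: Sequence[float] = GAMMAS) -> float:
    """Worst case of the quadratic form alpha S(gamma) alpha' over the grid."""
    worst = 0.0
    for g in gammas:
        s = s_matrix(g)
        value = sum(alpha[i] * s[i][j] * alpha[j] for i in range(2) for j in range(2))
        worst = max(worst, value)
    return worst


def minimax(steps: int = 3600) -> Tuple[Tuple[float, float], float]:
    """Search unit-magnitude coefficient vectors for the smallest worst-case error."""
    best_alpha, best_e = (1.0, 0.0), math.inf
    for n in range(steps):
        t = math.pi * n / steps          # alpha and -alpha give the same error
        alpha = (math.cos(t), math.sin(t))
        e = parity_error(alpha)
        if e < best_e:
            best_alpha, best_e = alpha, e
    return best_alpha, best_e


ALPHA_STAR, E_STAR = minimax()
```

In this toy case the sensor noise alone contributes `R` to the quadratic form for any unit-magnitude coefficient vector, so `E_STAR` cannot fall below `R`; the remainder reflects the plant noise and the spread of $\gamma$ over the grid.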

Although the objective function of (29) is quadratic in $\alpha$ and $\lambda$, (29) is generally very difficult to solve, because S may depend on $\gamma$ arbitrarily. (See [16] and the next section for a discussion of the solution to some special cases.) The dependence on $\gamma$ can be simplified somewhat by the following approximation. Recall that the role of a parity equation is to relate the outputs and inputs at different points in time. The matrices C, $\Phi$, and B, which specify the dynamics of the system, thus have the dominant effect on the choice of a parity equation. From this vantage point the primary effect of the uncertainty in $\gamma$ is typically manifested through the direct influence of these matrices on the matrix S, rather than through the indirect effect they have on $\Sigma(\gamma)$. Said another way, the variation in S as a function of $\gamma$ is dominated by the terms involving C, $\Phi$, and B, and in this case one introduces only a minor approximation by replacing $\Sigma(\gamma)$ by a constant $\Sigma$. This is equivalent to assuming the likely variations in the state do not change as a function of $\gamma$. With this approximation the S matrix shown above can be simplified, and we will use this approximation throughout the remainder of the paper.

Note that the dependence of $e(\alpha,\beta)$ on $x_0(k-p)$ and U(k) indicates that the coefficients in principle should be computed at each time step if $x_0(k-p)$ and U(k) are changing with time. This is clearly an undesirable requirement. Typically, a set of coefficients will work well for a range of values of $x_0(k-p)$ and U(k). Therefore, a practical approach is to schedule the coefficients according to the operating condition. Each operating condition may be treated as a set-point, which is characterized by some nominal state $x_0$ and input $U_0$ that are independent of time. Parity coefficients can be precomputed (by solving (29) with $x_0$ and $U_0$ in place of $x_0(k-p)$ and U(k)) and stored. Then the appropriate coefficients can be retrieved for use at the corresponding set-point. When the state and the input are varying slowly, this scheme of scheduling coefficients is likely to deliver performance close to the optimum.

If a more accurate approximation is desired, the coefficient scheduling scheme described above can be modified to account for variations in the input due, for example, to regulation of the system at the set-point $x_0$. In particular, one can consider modelling $U(k) = U_0 + \delta U(k)$, where $\delta U(k)$ is a (stationary) zero-mean random process that models the deviation of the input from the nominal $U_0$. With this modification, the expectation of $p^2(k)$ has to be taken with respect to the joint probability density of x(k-p), $\bar\xi(k)$, $\bar\eta(k)$, and $\delta U(k)$ with $x_0$ and $U_0$ fixed. This will lead to a more complex S matrix. Furthermore, the vector $\beta$ will no longer be constrained but completely free. However, the general form of the optimization problem remains unchanged.
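
The coefficient scheduling scheme amounts to a table lookup. The following sketch assumes that coefficients have already been precomputed for a handful of set-points and simply retrieves the entry nearest to the current operating condition; the table contents, the one-dimensional set-point key, and the function names are all hypothetical.

```python
from typing import Dict, Sequence, Tuple

SetPoint = Tuple[float, ...]

# Hypothetical precomputed table: set-point -> parity coefficients.
TABLE: Dict[SetPoint, Tuple[float, ...]] = {
    (1.0,): (0.6, -0.8, 0.0),
    (10.0,): (0.8, -0.6, 0.0),
}


def nearest_set_point(table: Dict[SetPoint, Tuple[float, ...]],
                      operating_point: Sequence[float]) -> SetPoint:
    """Pick the stored set-point closest (in Euclidean distance) to the current one."""
    return min(table, key=lambda sp: sum((a - b) ** 2 for a, b in zip(sp, operating_point)))


def scheduled_residual(table: Dict[SetPoint, Tuple[float, ...]],
                       operating_point: Sequence[float],
                       window: Sequence[float]) -> float:
    """Apply the coefficients stored for the nearest set-point to the data window."""
    coeffs = table[nearest_set_point(table, operating_point)]
    return sum(c * w for c, w in zip(coeffs, window))
```

A nearest-neighbour rule is only one possible choice; interpolating between stored coefficient sets would be another, at the cost of losing the exact optimality of each stored entry at its own set-point.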


Another approach to circumvent the requirement of solving the coefficient design problem for many values of $x_0$ and $U_0$ is to modify (29) to be

$$\min_{\alpha,\,\beta}\ \max_{\substack{\gamma \in \Gamma \\ x_0(k) \in X \\ U(k) \in Y}}\ E\{p^2(k)\} \qquad (30)$$


where X and Y denote the ranges of values that $x_0(k)$ and U(k) may take, respectively. This formulation leads to a single parity function over all operating conditions. We will not explore this approach here, but refer the reader to [17]. Whether this alternative approach or our coefficient-scheduling method is more appropriate depends on the problem. If the state and control are likely to vary significantly and if the resulting parity function is not that strong a function of $x_0$ and $U_0$, the alternative approach would be appropriate. If however the state and control are likely to be near specific set-points for periods of time, then using a parity function matched to that condition would yield superior performance.

With the coefficients and the associated parity errors determined for the candidate parity structures we can proceed to choose the parity functions for residual generation using the parity function method. As the squared magnitude of the coefficients $[\alpha, \beta]$ scales the parity error, the parity errors of different parity functions can be compared if they are normalized. We define the normalized parity error, $\bar e^*$, the normalized parity coefficients, $\bar\alpha^*$ and $\bar\beta^*$, and the normalized parity function, $\bar p^*(k)$, as follows

$$\bar e^* = e^* / \theta^2$$

$$\bar\alpha^* = \alpha^* / \theta$$

$$\bar\beta^* = \beta^* / \theta$$

$$\bar p^*(k) = \bar\alpha^*\, Y(k) + \bar\beta^*\, U(k)$$

where $\theta$ is the magnitude of the coefficient vector, $\theta^2 = [\alpha^*, \beta^*]\,[\alpha^*, \beta^*]'$.

The parity functions with the smallest normalized parity errors are preferred as they are closer to being true parity functions under noise and model uncertainty, i.e. they are least sensitive to these adverse effects.

An additional consideration required for choosing parity functions for residual generation is that the chosen parity functions should provide the largest failure signatures in the residuals relative to the inherent parity errors resulting from noise and parameter uncertainty. A useful index for comparing parity functions for this purpose is the signature to parity error ratio, $\pi$, which is the ratio between the magnitudes of the failure signature and the parity error. Using g to denote the effect of a failure on the parity function, $\pi$ can be defined as

For the detection and identification of a particular failure, the parity function that produces the largest $\pi$ should be used for residual generation. We give an example of this procedure in the next section.

Discussions 

Since a large signature to parity error ratio is desirable, a logical alternative approach to the choice of parity structure and coefficients is to consider the signature to parity error ratio as the objective function in the minimax design. Although this is a more direct way to achieve the design goal, it requires solving a more difficult optimization problem than (29). The method described above and the example in the next section take advantage of the comparatively simple optimization problem to illustrate the essential idea of how to determine redundancy relations that are least vulnerable to noise and model errors. For different residual generation methods the measures of usefulness of parity functions, such as $e$ and $\pi$ in the above, may be different, but the basic design concept illustrated here still applies.

The minimax problem (29) can be replaced by a minimization if a probability density for the parameter $\gamma$ can be postulated. That is, the design problem now takes the form

$$\min_{\substack{\alpha,\,\lambda \\ \alpha\alpha' = 1}}\ E\{p^2(k)\}$$

where the expectation of $p^2(k)$ is taken with respect to the joint density of x, $\bar\xi$, $\bar\eta$, and $\gamma$. This formulation will give a much simpler optimization to be solved practically than the minimax approach.

V.  A  NUMERICAL  EXAMPLE

In this section we consider the problem of choosing parity functions and parity coefficients for a 4-dimensional system operating at a set-point with two actuators and three sensors. The system matrices are shown in Table 1. Except for two elements in the A matrix all parameters are known exactly. These two elements are assumed to be independent parameters denoted by $\gamma_1$ and $\gamma_2$.

Suppose we want to design a voting system for detecting a sensor failure. Three candidate parity structures are

$$p_1(k) = \alpha_1\, [\,y_2(k-1),\ y_2(k),\ y_1(k-1)\,]'$$

$$p_2(k) = \alpha_2\, [\,y_2(k-2),\ y_1(k-2),\ y_1(k-1),\ y_1(k)\,]'$$

$$p_3(k) = \alpha_3\, [\,y_3(k-1),\ y_3(k),\ y_1(k-1)\,]'$$

where the $\alpha_i$'s are row vectors (of parity coefficients) of appropriate dimensions. The corresponding $\Phi$ and C matrices are shown in Table 2. Note that each C and $\Phi$ matrix depends linearly on either $\gamma_1$ or $\gamma_2$ and that the rows of $C_2$ are not linearly dependent for any value of $\gamma_2$. The parity structures under consideration do not contain any actuator terms due to the fact that $c_1B$, $c_2B$, $c_2AB$, and $c_3B$ are all zero. This will simplify the solution of the minimax problem without severely restricting the discussion. Assuming a single sensor may fail, only $p_3$ plus $p_1$ or $p_2$ need to be used for residual generation (because both $p_1$ and $p_2$ include sensors 1 and 2). Therefore, in addition to the coefficient design problem, we have to rank the two parity structures $p_1$ and $p_2$ in order to determine which will give more robust residuals.

The minimax design problem has been solved for a set of six test conditions consisting of different set-points and different plant and sensor noise intensities. These test conditions are described in Table 3. (The two set-points are obtained by applying $u_1 = 1$ or $u_2 = 10$ to the nominal system model.) The nominal state covariances $\Sigma_1$ and $\Sigma_2$ due to the two different plant noise intensities $Q_1$ and $Q_2$ are listed in Table 4. Due to the simple dependence of the parity functions on the $\gamma$'s an efficient solution procedure is possible [16]. The resulting parity coefficients and the corresponding (normalized) parity errors are summarized in Table 5.

It  is  evident  that  the  parity  coefficients  in  this  example  are  strongly  dependent  on 
the  test  condition  (i.e.  the  values  of  x0,  Q,  and  R).  Although  this  dependence  is  very 
complex,  some  insights  may  be  obtained  from  the  numerical  results.  Consider,  for 
instance,  Pi  under  conditions  b  and  c.  For  condition  b  the  parity  function  is 

p,b(k)  -  .6411  y2(k— 1)  -  ,7666 y2(k)  +  .0378y,(k-l) 
and  for  condition  c  it  is 

Plc(k)  -  .8947 y2(k-l)  -  .3667 y2(k)  -  .2551  y,(k-l) 

The only difference between these conditions lies in the value of $x_0$. Since the first
and fourth columns of $C_1$ are zero, only the second and third elements of $x_0$ ($x_{o2}$ and
$x_{o3}$) will play a role in the coefficient optimization problem. The parity function $p_1$ can
be written in the form

$$p_1 = \alpha_{11} x_{o2} + \alpha_{12}\,([\,?\,]\,x_{o2} + \gamma_1 x_{o3}) + \alpha_{13} x_{o3} + \xi$$

where $\alpha_{1i}$, $i = 1, 2, 3$, denote the elements of $\alpha_1$ corresponding to $y_2(k-1)$, $y_2(k)$, and
$y_1(k-1)$, respectively; $\xi$ denotes the remaining noise terms. It is clear that $x_{o3}$ and
$\alpha_{12}$ modulate the effect of $\gamma_1$ on $p_1$. Qualitatively, as $|x_{o3}|$ becomes large relative to
$|x_{o2}|$ (with all noise covariances the same), the optimal $\alpha_{12}$ will shrink (relative
to $\alpha_{11}$ and $\alpha_{13}$) in order to keep the effect of $\gamma_1$ small. As $|x_{o3}|$ increases, the signal-to-noise
ratio of $y_1(k)$ also increases. Therefore, we expect $|\alpha_{13}|$ to become large to take
advantage of the information provided by $y_1(k)$. Under condition b, $x_{o2} > x_{o3}$, and
under condition c the reverse is true. An inspection of $p_1$ under these conditions as
listed above shows that this reasoning holds: from b to c, $|\alpha_{12}|$ falls from .7666 to .3667
while $|\alpha_{13}|$ rises from .0378 to .2551. Therefore, built into the minimax
problem is a systematic way of handling the tradeoff between uncertainty effects due
to noise and error in system parameters.
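
The comparison of conditions b and c can be checked directly from the printed coefficients. The short Python sketch below (the names `ALPHA_B`, `ALPHA_C`, `norm` and `parity` are illustrative, and the signs are taken as read from the two expressions for p1 above) checks that both coefficient vectors have unit Euclidean norm to within rounding, and that the coefficient on y2(k) is smaller in magnitude, and the coefficient on y1(k-1) larger, under condition c than under condition b.

```python
import math

# Coefficients of p_1 as printed above, in the order
# y2(k-1), y2(k), y1(k-1).
ALPHA_B = (0.6411, -0.7666, 0.0378)   # condition b
ALPHA_C = (0.8947, -0.3667, -0.2551)  # condition c

def norm(alpha):
    """Euclidean norm of a coefficient vector."""
    return math.sqrt(sum(a * a for a in alpha))

def parity(alpha, y2_prev, y2_now, y1_prev):
    """Evaluate p_1(k) from the three outputs it depends on."""
    return alpha[0] * y2_prev + alpha[1] * y2_now + alpha[2] * y1_prev

for name, alpha in (("b", ALPHA_B), ("c", ALPHA_C)):
    # Print the norm and the magnitudes of the second and third coefficients.
    print(name, round(norm(alpha), 4), abs(alpha[1]), abs(alpha[2]))
```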

Note that both $p_1$ and $p_2$ relate the first sensor to the second one, and $p_2$ is a
higher order parity function than $p_1$. Furthermore, the rows of $C_2$ are not linearly
dependent for any value of $\gamma_2$. However, the parity error associated with $p_2$ is smaller
than that of $p_1$ in all conditions except condition a. This shows that a higher order
parity relation (which is more likely to contain higher order effects of $\gamma$) is not
necessarily more vulnerable to model errors and noise. In addition, a parity function
based on a $C$ matrix with rows that are linearly dependent for all values of $\gamma$ does not
necessarily produce a smaller parity error than a parity function that is based on a $C$
with independent rows.

In Table 6 we have tabulated the signature to parity error ratio associated with the
three parity functions for sensor failures that are modelled by a constant bias of size $\nu_i$
in the output, for test conditions c and d. Here, $\pi_i$ denotes the signature to parity
error ratio for a bias failure in sensor $i$, and it is calculated by substituting $\nu_i$ for $y_i$ in
the parity function (26) with the minimax coefficients. Such a table is helpful for
determining the relative merits of $p_1$ and $p_2$. For instance, under condition d and
assuming [...], $p_2$ is preferred to $p_1$ because it has a larger value of $\pi_1$ than $p_1$ while
its $\pi_2$ value is comparable to that of $p_1$.
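
The substitution that defines the signature can be sketched in a few lines of Python. For a constant bias, replacing the output of the failed sensor by the bias reduces the parity function to the bias times the sum of the coefficients that multiply that sensor's output. The sketch below applies this to the condition c coefficients of p1 quoted earlier. The function names, the unit bias, the zero contribution from the unfailed sensor, the use of the absolute value, and the `PARITY_ERROR` value are all assumptions for illustration; in particular, `PARITY_ERROR` is a placeholder and not a value taken from Table 5.

```python
# Condition c coefficients of p_1, paired with the sensor each one multiplies:
# y2(k-1), y2(k), y1(k-1).
ALPHA_C = (0.8947, -0.3667, -0.2551)
SENSORS = (2, 2, 1)

def bias_signature(alpha, sensors, failed, bias):
    """Parity value when sensor `failed` outputs a constant `bias` and the
    other sensor contributes nothing (an assumption of this sketch)."""
    return bias * sum(a for a, s in zip(alpha, sensors) if s == failed)

def signature_ratio(alpha, sensors, failed, bias, parity_error):
    """Magnitude of the signature relative to the parity error."""
    return abs(bias_signature(alpha, sensors, failed, bias)) / parity_error

PARITY_ERROR = 0.1  # hypothetical placeholder, not a value from Table 5
for failed in (1, 2):
    print(failed, signature_ratio(ALPHA_C, SENSORS, failed, 1.0, PARITY_ERROR))
```

Because the two coefficients on sensor 2 have opposite signs, they partly cancel under a constant bias, so the size of a signature depends on the coefficient sum for that sensor and not on the individual magnitudes.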
VI. CONCLUSIONS

In this paper we have characterized the notion of analytical redundancy in terms of
a generalized parity space. We have described three methods for using parity relations
to generate residuals for FDI. The problem of determining robust parity relations for
residual generation using the parity function method was studied. This design task
was formulated as an optimization problem, and an example was presented to
illustrate the design methodology. A number of problem areas await further research.
They include: a method for selecting useful parity structures for the parity coefficient
problem studied in Section IV, solution procedures for the (minimax) optimization
problem, and a method for determining parity relations for other methods of residual
generation (i.e. the open-loop and the closed-loop methods).
TABLE 2: THE C AND Φ MATRICES
TABLE 3: TEST CONDITIONS


